Biostatistics For Dummies (Monika Wahi John Pezzullo)

less than 4. Having 30 events — which in this case are fatal car accidents — isn’t statistically

significantly different from having 40 events in the same time period. As you see from the result, the

increase of 10 in one year is likely statistical noise. But had the number of events increased more

dramatically — say from 30 to 50 events — the increase would have been statistically significant.

This is because $(50 - 30)^2 / (50 + 30) = 400 / 80 = 5$, which is greater than 4.
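The comparison of the two pairs of counts can be sketched in a few lines of Python. This is a minimal illustration, assuming the statistic is the squared difference of the two counts divided by their sum, which reproduces the "less than 4" and "greater than 4" results described above. The function names `count_difference_statistic` and `counts_differ` are hypothetical, and the cutoff of 4 is the approximate critical value the text uses for α = 0.05.

```python
def count_difference_statistic(n1: int, n2: int) -> float:
    """Squared difference of two event counts divided by their sum.

    Assumes both counts were observed over the same amount of exposure
    (for example, the same region over two equal time periods).
    """
    if n1 + n2 == 0:
        raise ValueError("at least one event is needed to compare counts")
    return (n1 - n2) ** 2 / (n1 + n2)


def counts_differ(n1: int, n2: int, threshold: float = 4.0) -> bool:
    """True when the statistic exceeds the cutoff (about 4 for alpha = 0.05)."""
    return count_difference_statistic(n1, n2) > threshold


# The two comparisons discussed in the text: 30 vs. 40 and 30 vs. 50 events.
for first, second in [(30, 40), (30, 50)]:
    stat = count_difference_statistic(first, second)
    verdict = "significant" if counts_differ(first, second) else "not significant"
    print(f"{first} vs. {second} events: statistic = {stat:.2f} ({verdict})")
```

Under these assumptions, 30 versus 40 events gives 100/70, or about 1.43, and 30 versus 50 events gives 400/80 = 5, so only the second comparison clears the cutoff.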

Estimating the Required Sample Size

As in all sample-size calculations, you need to specify the desired statistical power and the α level of

the test. Let’s set power to 80 percent and α to 0.05, as these are common settings. When comparing

event rates ($R_1$ and $R_2$) between two groups with $R_1$ as the reference group, you must also specify:

The expected rate in the reference group ($R_1$)

The effect size of importance, expressed as the rate ratio ($RR = R_2 / R_1$)

The expected ratio of exposure in the two groups

For example, suppose that you’re designing a study to test whether rotavirus gastroenteritis has a higher incidence in City ABC compared to City XYZ. You’ll enroll an equal number of City XYZ and City ABC residents, and follow them for one year to see whether they get rotavirus. Suppose that the one-year incidence of rotavirus in City XYZ, the reference group, is 1 case per 100 person-years (an incidence rate of 0.01 case per person-year, or 1 percent per year). You want to have an 80 percent likelihood of getting a statistically significant result at the 0.05 significance level (that is, you want to set power at 80 percent and α = 0.05). When comparing the incidence rates, you are concerned only if they differ by 25 percent or more, which translates to an RR of 1.25. This means you expect to see 0.01 × 1.25 = 0.0125 cases per person-year in City ABC.

If you want to use G*Power to do your power calculation (see Chapter 4), under Test family, choose z

tests for population-level tests. Under Statistical test, choose Proportions: Difference between two

independent proportions because the two rates are independent. Under Type of power analysis,

choose A priori: Compute required sample size – given α, power and effect size, and under the Input

Parameters section, choose two tails so you can test if one is higher or lower than the other. Set

Proportion p1 to 0.01 (to represent City XYZ’s incidence rate), Proportion p2 to 0.0125 (to represent

City ABC’s expected incidence rate), α err prob (α) to 0.05, and Power (1-β err prob) (power) to 0.8

for 80 percent, and keep a balanced Allocation ratio N2/N1 of 1. After clicking Calculate, you’ll see

you need at least 27,937 person-years of observation in each group, meaning observing about 56,000

participants over a one-year study. The shockingly large target sample size illustrates a challenge when

studying incidence rates of rare illnesses.
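As a rough cross-check on the G*Power result, the same calculation can be sketched in Python using only the standard library. This is a hedged illustration, not G*Power's own algorithm: it assumes the common normal-approximation formula for a two-sided z test of two independent proportions with equal allocation (pooled variance under the null hypothesis, unpooled under the alternative), and it treats each incidence rate as a proportion per person-year, as the example above does. The function name `two_proportion_sample_size` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_proportion_sample_size(p1: float, p2: float,
                               alpha: float = 0.05,
                               power: float = 0.80) -> float:
    """Approximate per-group n for a two-sided z test of two proportions.

    Assumes equal allocation (N2/N1 = 1) and a normal approximation;
    other software may use a slightly different formula.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to define an effect size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2                          # pooled proportion under the null
    null_sd = sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ((z_alpha * null_sd + z_beta * alt_sd) / (p1 - p2)) ** 2


reference_rate = 0.01                       # City XYZ: 1 case per 100 person-years
rate_ratio = 1.25                           # smallest effect of interest
comparison_rate = reference_rate * rate_ratio   # expected rate in City ABC

per_group = ceil(two_proportion_sample_size(reference_rate, comparison_rate))
print(f"Person-years per group: {per_group:,}")
print(f"Total participants in a one-year study: {2 * per_group:,}")
```

Under these assumptions the per-group figure should land very close to the 27,937 person-years reported above. Raising `rate_ratio` (for example, to 2.0) shrinks the requirement sharply, which shows why small effects on rare outcomes demand such large studies.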